NSF PAR Search | NSF Public Access Repository

Automating methods for estimating metabolite volatility

https://doi.org/10.3389/fmicb.2023.1267234

Meredith, Laura K.; Ledford, S. Marshall; Riemer, Kristina; Geffre, Parker; Graves, Kelsey; Honeker, Linnea K.; LeBauer, David; Tfaily, Malak M.; Krechmer, Jordan (December 2023, Frontiers in Microbiology)

The volatility of metabolites can influence their biological roles and inform optimal methods for their detection. Yet, volatility information is not readily available for the large number of described metabolites, limiting the exploration of volatility as a fundamental trait of metabolites. Here, we adapted methods to estimate vapor pressure from the functional group composition of individual molecules (SIMPOL.1) to predict the gas-phase partitioning of compounds in different environments. We implemented these methods in a new open pipeline called volcalc that uses chemoinformatic tools to automate these volatility estimates for all metabolites in an extensive and continuously updated pathway database: the Kyoto Encyclopedia of Genes and Genomes (KEGG) that connects metabolites, organisms, and reactions. We first benchmark the automated pipeline against a manually curated data set and show that the same category of volatility (e.g., nonvolatile, low, moderate, high) is predicted for 93% of compounds. We then demonstrate how volcalc might be used to generate and test hypotheses about the role of volatility in biological systems and organisms. Specifically, we estimate that 3.4 and 26.6% of compounds in KEGG have high volatility depending on the environment (soil vs. clean atmosphere, respectively) and that a core set of volatiles is shared among all domains of life (30%) with the largest proportion of kingdom-specific volatiles identified in bacteria. With volcalc, we lay a foundation for uncovering the role of the volatilome using an approach that is easily integrated with other bioinformatic pipelines and can be continually refined to consider additional dimensions to volatility. The volcalc package is an accessible tool to help design and test hypotheses on volatile metabolites and their unique roles in biological systems.
Full Text Available
Predicting spring phenology in deciduous broadleaf forests: NEON phenology forecasting community challenge

https://doi.org/10.1016/j.agrformet.2023.109810

Wheeler, Kathryn I.; Dietze, Michael C.; LeBauer, David; Peters, Jody A.; Richardson, Andrew D.; Ross, Arun A.; Thomas, R Quinn; Zhu, Kai; Bhat, Uttam; Munch, Stephan; et al (February 2024, Agricultural and Forest Meteorology)

Full Text Available
Reviews and syntheses: The promise of big diverse soil data, moving current practices towards future potential

https://doi.org/10.5194/bg-19-3505-2022

Todd-Brown, Katherine E.; Abramoff, Rose Z.; Beem-Miller, Jeffrey; Blair, Hava K.; Earl, Stevan; Frederick, Kristen J.; Fuka, Daniel R.; Guevara Santamaria, Mario; Harden, Jennifer W.; Heckman, Katherine; et al (January 2022, Biogeosciences)

Abstract. In the age of big data, soil data are more available and richer than ever, but – outside of a few large soil survey resources – they remain largely unusable for informing soil management and understanding Earth system processes beyond the original study. Data science has promised a fully reusable research pipeline where data from past studies are used to contextualize new findings and reanalyzed for new insight. Yet synthesis projects encounter challenges at all steps of the data reuse pipeline, including unavailable data, labor-intensive transcription of datasets, incomplete metadata, and a lack of communication between collaborators. Here, using insights from a diversity of soil, data, and climate scientists, we summarize current practices in soil data synthesis across all stages of database creation: availability, input, harmonization, curation, and publication. We then suggest new soil-focused semantic tools to improve existing data pipelines, such as ontologies, vocabulary lists, and community practices. Our goal is to provide the soil data community with an overview of current practices in soil data and where we need to go to fully leverage big data to solve soil problems in the next century.
Full Text Available
Ten simple rules to cultivate transdisciplinary collaboration in data science

https://doi.org/10.1371/journal.pcbi.1008879

Sahneh, Faryad; Balk, Meghan A.; Kisley, Marina; Chan, Chi-kwan; Fox, Mercury; Nord, Brian; Lyons, Eric; Swetnam, Tyson; Huppenkothen, Daniela; Sutherland, Will; et al (May 2021, PLOS Computational Biology)
Schwartz, Russell (Ed.)
Full Text Available
A reporting format for leaf-level gas exchange data and metadata

https://doi.org/10.1016/j.ecoinf.2021.101232

Ely, Kim S.; Rogers, Alistair; Agarwal, Deborah A.; Ainsworth, Elizabeth A.; Albert, Loren P.; Ali, Ashehad; Anderson, Jeremiah; Aspinwall, Michael J.; Bellasio, Chandra; Bernacchi, Carl; et al (March 2021, Ecological Informatics)
Full Text Available
